library(here)
library(stringr)
library(magrittr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(atlantisom)
Current atlantisom functions are designed primarily for
age structured modeled groups, because the package was originally scoped
to generate simulated data to test fishery stock assessments. It is now
clear that simulated data testing can also benefit other model types,
including multispecies models and food web models. Therefore, there is a
need to develop functions to develop simulated data for lower trophic
level biomass pools modeled in Atlantis and also included in food web
models such as Rpath.
Where is biomass information for non-age structured groups output? To be parallel to survey sampling for fish, it would be good to have it by model output timestep toutinc and polygon, but depth layer is not necessary at this time.
First check content of the .nc output file:
dir <- here("atlantisoutput/NOBA_sacc_30")
file_nc <- "nordic_runresults_01.nc"
file.nc <- file.path(dir, file_nc)
# Load ATLANTIS output!
at_out <- RNetCDF::open.nc(con = file.nc)
#on.exit(RNetCDF::close.nc(at_out))
# Get info from netcdf file! (Filestructure and all variable names)
var_names_ncdf <- sapply(seq_len(RNetCDF::file.inq.nc(at_out)$nvars - 1),
function(x) RNetCDF::var.inq.nc(at_out, x)$name)
n_timesteps <- RNetCDF::dim.inq.nc(at_out, 0)$length
n_boxes <- RNetCDF::dim.inq.nc(at_out, 1)$length
n_layers <- RNetCDF::dim.inq.nc(at_out, 2)$length
Which species are biomass pools, are they identified in the groups.csv?
No, biomass pools are defined as the combination of NumCohorts == 1 and GroupType not FISH, FISH_INVERT, SHARK, BIRD, MAMMAL, REPTILE. Biomass pools groups are among those below according to the Atlantis manual, ch 6. p 106:
BIOMASS POOL GROUPS
Phytoplankton
7) Small Phytoplankton SM_PHY
8) Large Phytoplankton LG_PHY (used in combination with IsSiliconDep to define diatoms)
9) Trichodesmium and Cyanobacteria TRICHO (N fixers)
Other primary producers
10) Microphtybenthos MICROPHTYBENTHOS
11) Dinoflagellates DINOFLAG
12) Phytoben PHYTOBEN (typically used to represent macroalgae)
13) Seagrass SEAGRASS
14) Turf TURF
Zooplankton
15) Small Zooplankton SM_ZOO
16) Medium Zooplankton MED_ZOO
17) Large Zooplankton LG_ZOO
18) Jellyfish JELLIES (in the past represented as LG_ZOO)
Large pelagic invetebrates
19)Cephalopod CEP
20) Prawns PWN
Infauna
21) Small Infauna SM_INF
22) Large Infauna LG_INF
Epibenthic organisms
23) Sediment epibenthic filter feeders SED_EP_FF (often used for bivalves, sponges)
24) Benthic grazers SED_EP_OTHER
25) Mobile epibenthos MOB_EP_OTHER (often used for crabs, lobster, octopus)
26) Corals CORAL
27)Sponges SPONGE
Bacteria
28) Pelagic Bacteria PL_BACT
29) Sediment Bacteria SED_BACT
Obligatory detritus groups
30) Carrion CARRION
31) Labile detritus LAB_DET
32) Refractory detritus REF_DET
33) Additional ice and land based groups are also available.
Ice dwelling
34) ICE_BACT
35) ICE_MIXOTROPHS
36) ICE_DIATOMS
37) ICE_ZOOBIOTA
Land dwelling (vegetation only for now)
38) MARSH
39) MANGROVE
For NOBA Atlantis, these are the biomass pools:
fgs <- atlantisom::load_fgs(here("atlantisoutput/NOBA_sacc_30"), "nordic_groups_v04.csv")
pools <- fgs %>%
dplyr::filter(NumCohorts==1) %>%
dplyr::select(Code, Name, Long.Name, NumCohorts, InvertType)
knitr::kable(pools)
| Code | Name | Long.Name | NumCohorts | InvertType |
|---|---|---|---|---|
| KCR | King_crab | Red king crab | 1 | LG_INF |
| ZG | Gel_zooplankton | Gelatinous zooplankton | 1 | LG_ZOO |
| ZL | Large_zooplankton | Large zooplankton | 1 | LG_ZOO |
| ZM | Medium_zooplankton | Medium zooplankton | 1 | MED_ZOO |
| ZS | Small_zooplankton | Small zooplankton | 1 | SM_ZOO |
| DF | Dinoflag | Dinoflagellates | 1 | DINOFLAG |
| PS | Small_phytop | Small phytoplankton | 1 | SM_PHY |
| PL | Large_phytop | Large phytoplankton | 1 | LG_PHY |
| BC | Predatory_ben | Predatory benthos | 1 | LG_INF |
| BD | Detrivore_ben | Detrivore benthos | 1 | LG_INF |
| BFF | Benthic_filterf | Benthic filter feeders | 1 | SED_EP_FF |
| SPO | Sponges | Sponges | 1 | SED_EP_FF |
| COR | Corals | Corals | 1 | SED_EP_FF |
| PB | Pel_bact | Pelagic bacteria | 1 | PL_BACT |
| BB | Ben_bact | Sediment bacteria | 1 | SED_BACT |
| DR | Ref_det | Refractory detritus | 1 | REF_DET |
| DL | Lab_det | Labile detritus | 1 | LAB_DET |
| DC | Carrion | Carrion | 1 | CARRION |
Here are the relevant outputs in the output.nc file for these groups:
# return everything in var_names_ncdf matching Name in pools
var_names_ncdf[str_detect(var_names_ncdf, str_c(pools$Name, collapse ="|"))]
## [1] "Large_phytop_N" "Small_phytop_N" "Dinoflag_N"
## [4] "King_crab_N" "Gel_zooplankton_N" "Large_zooplankton_N"
## [7] "Medium_zooplankton_N" "Small_zooplankton_N" "Pel_bact_N"
## [10] "Ben_bact_N" "Detrivore_ben_N" "Predatory_ben_N"
## [13] "Ref_det_N" "Lab_det_N" "Carrion_N"
## [16] "Benthic_filterf_N" "Sponges_N" "Corals_N"
## [19] "Benthic_filterf_Cover" "Sponges_Cover" "Corals_Cover"
So we want “N” nitrogen output. In theory this can be achieved with
atlantisom::load_ncwith select_variable = "N".
Looks like it works, output is N for our selected biomass pool species
over all polygons, layers, and times. Output of
unique(testNpools$species):
select_groups <- pools$Name
nc_out <- file_nc
bps <- load_bps(dir = dir, fgs = "nordic_groups_v04.csv", file_init = "nordic_biol_v23.nc")
# Get the boundary boxes
allboxes <- load_box(dir = dir, file_bgm = "Nordic02.bgm")
boxes <- get_boundary(allboxes)
testNpools <- load_nc(dir = dir,
file_nc = nc_out,
bps = bps,
fgs = fgs,
select_groups = select_groups,
select_variable = "N",
check_acronyms = TRUE,
bboxes = boxes)
## [1] "Read /Users/sarah.gaichas/Documents/0_Data/Atlantis/Poseidon/poseidon-dev/atlantisoutput/NOBA_sacc_30/nordic_runresults_01.nc successfully"
unique(testNpools$species)
## [1] "Benthic_filterf" "Sponges" "Corals"
## [4] "King_crab" "Gel_zooplankton" "Large_zooplankton"
## [7] "Medium_zooplankton" "Small_zooplankton" "Dinoflag"
## [10] "Small_phytop" "Large_phytop" "Predatory_ben"
## [13] "Detrivore_ben" "Pel_bact" "Ben_bact"
## [16] "Ref_det" "Lab_det" "Carrion"
I don’t think we have a function that aggregates over layers and converts N to biomass but there are pieces in other functions.
New atlantisom function: calc_biomass_pool
based on calc_biomass_age, but adding the expansion to
volume used in calc_pred_cons.
# we need the volume of each layer as input to this
vol <- load_nc_physics(dir = dir,
file_nc = nc_out,
physic_variables = "volume",
aggregate_layers = FALSE,
bboxes = boxes)
# we need area for the benthic inverts instead of volume
# THANK YOU JOE for this tip
area <- load_boxarea(dir = dir,
file_bgm = "Nordic02.bgm")
# we need a list of which group types to expand to area instead of volume
benthos <- c("SED_EP_FF", "SED_EP_OTHER", "SED_BAC",
"MOB_EP_OTHER", "LG_INF", "SM_INF",
"MICROPHYTBENTHOS", "SEAGRASS")
calc_biomass_pool <- function(pooln, vol, area, fgs, biolprm){
datalist <- list(pooln)
# Conversion factor from mg N to t wet-weight
# should only use conversion for non vertebrates
# also need volume of cell info
bio_conv <- biolprm$redfieldcn * biolprm$kgw2d / 1000000000
# get fgs info
# use grouptype column to allocate
colnames(fgs) <- tolower(colnames(fgs))
# check for GroupType or InvertType
if (!("grouptype" %in% colnames(fgs) | "inverttype"%in% colnames(fgs))) {
stop(paste("The columns GroupType or InvertType ars not in your functional\n",
"groups file."))
}
# change inverttype to grouptype, contents should be the same
if("inverttype" %in% names(fgs)) names(fgs)[names(fgs) == 'inverttype'] <- 'grouptype'
fgs$grouptype <- tolower(fgs$grouptype)
pooltype <- fgs |>
dplyr::select(species=name, grouptype) |>
dplyr::mutate(pooltype = dplyr::case_when((grouptype %in% c("lg_inf",
"sm_inf",
"sed_bact",
"sed_ep_ff",
"sed_ep_other",
"mobile_ep_other",
"coral",
"sponge")) ~ "benthos",
TRUE ~ "nonbenth"))
data_names <- c("species", "agecl", "polygon", "layer", "time", "atoutput")
if (all(sapply(datalist, function(x) all(is.element(names(x), data_names))))){
names(vol)[names(vol) == "atoutput"] <- "vol"
sedlayer <- max(vol$layer)
pooln <- merge(pooln, vol,
by = c("time", "polygon", "layer"))
pooln <- merge(pooln, area,
by = c("polygon"))
pooln <- merge(pooln, pooltype,
by = c("species"))
#pooln$atoutput <- with(pooln, vol * atoutput * bio_conv)
pooln <- pooln |>
dplyr::mutate(atoutput = dplyr::case_when(pooltype == "nonbenth" ~ vol * atoutput * bio_conv,
pooltype == "benthos" ~
ifelse(layer==sedlayer, area * atoutput * bio_conv, 0)))
# Sum over layers
biomass_pools <- aggregate(atoutput ~ species + agecl + time + polygon,
data = pooln, sum)
} else {
stop(paste("Dataframe names do not match with", data_names))
}
return(biomass_pools)
}
biolprm <- load_biolprm(dir = dir, file_biolprm = "nordic_biol_incl_harv_v_011_1skg.prm")
## Warning in `[<-.data.frame`(`*tmp*`, , 3:(maxmat + 2), value = structure(list(:
## provided 20 variables to replace 10 variables
biomass_pools <- calc_biomass_pool(pooln = testNpools,
vol = vol,
area = area,
fgs = fgs,
biolprm = biolprm)
Do the new biomass pools match the BiomIndx.txt output
when aggregated over polygon? If so they can be the input to the survey
function for time and spatial subsetting.
aggbiopools <- biomass_pools %>%
dplyr::group_by(species, agecl, time) %>%
dplyr::summarise(totpool = sum(atoutput))
## `summarise()` has grouped output by 'species', 'agecl'. You can override using
## the `.groups` argument.
txtbiopools <- atlantisom::load_bioind(dir, "nordic_runresults_01BiomIndx.txt", fgs) %>%
dplyr::mutate(time = time/73) %>%
dplyr::filter(species %in% unique(aggbiopools$species)) %>%
dplyr::select(species, time, atoutput)
ggplot(txtbiopools, aes(x=time, y=atoutput)) +
geom_line() +
geom_point(data = aggbiopools, aes(x = time, y = totpool), colour = "blue", alpha=0.1) +
theme_bw() +
facet_wrap(~species, scales = "free_y")
Who matches exactly (enough). Patterns in absolute mismatch are interesting, some species are really close and some always way off. Volume wrong, depth layers wrong? Mine are always higher than the txt output when they are off, so maybe too much expansion somewhere.
Joe pointed out that benthic groups should be expanded to area rather than volume. That correction has now been applied, and matches are better, but still diverge over time for some groups.
Groups that match seem more likely to be pelagic (dinoflagellates, all zooplankton, pelagic bacteria, labile and refractive detritus) although large phytoplankton is way off.
mismatch <- txtbiopools |>
dplyr::left_join(aggbiopools) |>
dplyr::mutate(mismatch = atoutput - totpool,
bad = ifelse(abs(mismatch) > 0.01*(atoutput), 1, 0))
## Joining, by = c("species", "time")
ggplot(mismatch, aes(x=time, y=mismatch)) +
geom_line() +
theme_bw() +
facet_wrap(~species, scales = 'free_y')
This is mismatch coded as 0 if less than 1% of txt output, 1 if more. Those exceeding are carrion, predatory benthos, detritivore benthos, king crab, benthic bacteria, large phytoplankton and sometimes small zooplankton.
ggplot(mismatch, aes(x=time, y=bad)) +
geom_line() +
theme_bw() +
facet_wrap(~species)
What is up with the phytoplankton groups? Small matches well visually but large doesn’t. Large phytoplankton is the largest magnitude mismatch across all groups.
ggplot(txtbiopools |> filter(species %in% c("Large_phytop", "Small_phytop")),
aes(x=time, y=atoutput)) +
geom_line() +
geom_point(data = aggbiopools |> filter(species %in% c("Large_phytop", "Small_phytop")),
aes(x = time, y = totpool), colour = "blue") +
theme_bw() +
facet_wrap(~species, scales = "free_y")
Predatory benthos is off by what appears to be a scalar, Detrivore ben converges as they increase.
ggplot(txtbiopools |> filter(species %in% c("Predatory_ben", "Detrivore_ben")),
aes(x=time, y=atoutput)) +
geom_line() +
geom_point(data = aggbiopools |> filter(species %in% c("Predatory_ben", "Detrivore_ben")),
aes(x = time, y = totpool), colour = "blue", alpha=0.1) +
theme_bw() +
facet_wrap(~species, scales = "free_y")
dir <- here("atlantisoutput/NEUSv2.1.0")
file_nc <- "neus_output.nc"
file.nc <- file.path(dir, file_nc)
# Load ATLANTIS output!
at_out <- RNetCDF::open.nc(con = file.nc)
#on.exit(RNetCDF::close.nc(at_out))
# Get info from netcdf file! (Filestructure and all variable names)
var_names_ncdf <- sapply(seq_len(RNetCDF::file.inq.nc(at_out)$nvars - 1),
function(x) RNetCDF::var.inq.nc(at_out, x)$name)
n_timesteps <- RNetCDF::dim.inq.nc(at_out, 0)$length
n_boxes <- RNetCDF::dim.inq.nc(at_out, 1)$length
n_layers <- RNetCDF::dim.inq.nc(at_out, 2)$length
For NEUS Atlantis, these are the biomass pools:
fgs <- atlantisom::load_fgs(here("atlantisoutput/NEUSv2.1.0"), "neus_groups.csv")
pools <- fgs %>%
dplyr::filter(NumCohorts==1) %>%
dplyr::select(Code, Name, LongName, NumCohorts, GroupType)
knitr::kable(pools)
| Code | Name | LongName | NumCohorts | GroupType |
|---|---|---|---|---|
| SCA | Scallop | Sea scallop | 1 | SED_EP_FF |
| QHG | Quahog | Ocean quahog | 1 | SED_EP_FF |
| CLA | Surf_Clam | Atlantic surf clam | 1 | SED_EP_FF |
| BFF | Filter_Other | Other benthic filter feeder | 1 | SED_EP_FF |
| BG | Benthic_grazer | Benthic grazer | 1 | SED_EP_OTHER |
| LOB | Lobster | Lobster | 1 | MOB_EP_OTHER |
| RCB | Red_Crab | Red deep-sea crab | 1 | MOB_EP_OTHER |
| BMS | Macrobenth_Shallow | Shallow macrozoobenthos | 1 | MOB_EP_OTHER |
| ZL | Carniv_Zoo | Carnivorous zooplankton | 1 | LG_ZOO |
| BD | Deposit_Feeder | Deposit Feeder | 1 | LG_INF |
| MA | Macroalgae | Macroalgae | 1 | PHYTOBEN |
| MB | MicroPB | Microphtybenthos | 1 | MICROPHTYBENTHOS |
| SG | Seagrass | Seagrass | 1 | SEAGRASS |
| BC | Benthic_Carniv | Benthic Carnivore | 1 | MOB_EP_OTHER |
| ZG | Gelat_Zoo | Gelatinous zooplankton | 1 | LG_ZOO |
| PL | Diatom | Diatom | 1 | LG_PHY |
| DF | DinoFlag | Dinoflagellates | 1 | DINOFLAG |
| PS | PicoPhytopl | Pico-phytoplankton | 1 | SM_PHY |
| ZM | Zoo | Mesozooplankton | 1 | MED_ZOO |
| ZS | MicroZoo | Microzooplankton | 1 | SM_ZOO |
| PB | Pelag_Bact | Pelagic Bacteria | 1 | PL_BACT |
| BB | Sed_Bact | Sediment Bacteria | 1 | SED_BACT |
| BO | Meiobenth | Meiobenthos | 1 | SM_INF |
| DL | Lab_Det | Labile detritus | 1 | LAB_DET |
| DR | Ref_Det | Refractory detritus | 1 | REF_DET |
| DC | Carrion | Carrion | 1 | CARRION |
Here are the relevant outputs in the output.nc file for these groups:
# return everything in var_names_ncdf matching Name in pools
var_names_ncdf[str_detect(var_names_ncdf, str_c(pools$Name, collapse ="|"))]
## [1] "Carniv_Zoo_N" "Carrion_N" "Deposit_Feeder_N"
## [4] "Diatom_N" "Diatom_S" "DinoFlag_N"
## [7] "Gelat_Zoo_N" "Lab_Det_N" "Meiobenth_N"
## [10] "MicroPB_N" "MicroPB_S" "MicroZoo_N"
## [13] "Pelag_Bact_N" "PicoPhytopl_N" "Ref_Det_N"
## [16] "Sed_Bact_N" "Zoo_N" "Benthic_Carniv_N"
## [19] "Benthic_grazer_N" "Filter_Other_Cover" "Filter_Other_N"
## [22] "Lobster_N" "Macroalgae_Cover" "Macroalgae_N"
## [25] "Macrobenth_Shallow_N" "MicroPB_Cover" "Quahog_Cover"
## [28] "Quahog_N" "Scallop_Cover" "Scallop_N"
## [31] "Seagrass_Cover" "Seagrass_N" "Surf_Clam_Cover"
## [34] "Surf_Clam_N"
NEUS has some Cover types as well as N outputs.
select_groups <- pools$Name
nc_out <- file_nc
bps <- load_bps(dir = dir, fgs = "neus_groups.csv", file_init = "neus_init.nc")
# Get the boundary boxes
allboxes <- load_box(dir = dir, file_bgm = "neus_tmerc_RM2.bgm")
boxes <- get_boundary(allboxes)
testNpools <- load_nc(dir = dir,
file_nc = nc_out,
bps = bps,
fgs = fgs,
select_groups = select_groups,
select_variable = "N",
check_acronyms = TRUE,
bboxes = boxes)
## Warning in load_nc(dir = dir, file_nc = nc_out, bps = bps, fgs = fgs, select_groups = select_groups, : Some selected groups are not active in the model run. Check 'IsTurnedOn' in fgs
## Red_Crab
## [1] "Read /Users/sarah.gaichas/Documents/0_Data/Atlantis/Poseidon/poseidon-dev/atlantisoutput/NEUSv2.1.0/neus_output.nc successfully"
unique(testNpools$species)
## [1] "Scallop" "Quahog" "Surf_Clam"
## [4] "Filter_Other" "Benthic_grazer" "Lobster"
## [7] "Macrobenth_Shallow" "Macroalgae" "Seagrass"
## [10] "Benthic_Carniv" "Carniv_Zoo" "Deposit_Feeder"
## [13] "MicroPB" "Gelat_Zoo" "Diatom"
## [16] "DinoFlag" "PicoPhytopl" "Zoo"
## [19] "MicroZoo" "Pelag_Bact" "Sed_Bact"
## [22] "Meiobenth" "Lab_Det" "Ref_Det"
## [25] "Carrion"
# we need the volume of each layer as input to this
vol <- load_nc_physics(dir = dir,
file_nc = nc_out,
physic_variables = "volume",
aggregate_layers = FALSE,
bboxes = boxes)
# we need area for the benthic inverts instead of volume
# THANK YOU JOE for this tip
area <- load_boxarea(dir = dir,
file_bgm = "neus_tmerc_RM2.bgm")
biolprm <- load_biolprm(dir = dir, file_biolprm = "at_biology.prm")
## Warning in `[<-.data.frame`(`*tmp*`, , 3:(maxmat + 2), value = structure(list(:
## provided 20 variables to replace 10 variables
## Warning in load_biolprm(dir = dir, file_biolprm = "at_biology.prm"): NAs
## introduced by coercion
biomass_pools <- calc_biomass_pool(pooln = testNpools,
vol = vol,
area = area,
fgs = fgs,
biolprm = biolprm)
Do the new biomass pools match the BiomIndx.txt output
when aggregated over polygon? If so they can be the input to the survey
function for time and spatial subsetting.
aggbiopools <- biomass_pools %>%
dplyr::group_by(species, agecl, time) %>%
dplyr::summarise(totpool = sum(atoutput))
## `summarise()` has grouped output by 'species', 'agecl'. You can override using
## the `.groups` argument.
txtbiopools <- atlantisom::load_bioind(dir, "neus_outputBiomIndx.txt", fgs) %>%
dplyr::mutate(time = time/73) %>%
dplyr::filter(species %in% unique(aggbiopools$species)) %>%
dplyr::select(species, time, atoutput)
ggplot(txtbiopools, aes(x=time, y=atoutput)) +
geom_line() +
geom_point(data = aggbiopools, aes(x = time, y = totpool), colour = "blue", alpha=0.1) +
theme_bw() +
facet_wrap(~species, scales = "free_y")
mismatch
mismatch <- txtbiopools |>
dplyr::left_join(aggbiopools) |>
dplyr::mutate(mismatch = atoutput - totpool,
bad = ifelse(abs(mismatch) > 0.01*(atoutput), 1, 0))
## Joining, by = c("species", "time")
ggplot(mismatch, aes(x=time, y=mismatch)) +
geom_line() +
theme_bw() +
facet_wrap(~species, scales = 'free_y')
This is mismatch coded as 0 if less than 1% of txt output, 1 if
more.
ggplot(mismatch, aes(x=time, y=bad)) +
geom_line() +
theme_bw() +
facet_wrap(~species)
Learned that we should not hardcode the sediment layer! Added a sedlayer variable to the function.
BUT. Diatoms are still off so this calculation may not be working for that category.
What is up with the phytoplankton groups? Small matches well visually but large doesn’t. Large phytoplankton is the largest magnitude mismatch across all groups.
ggplot(txtbiopools |> filter(species %in% c("Diatom", "DinoFlag")),
aes(x=time, y=atoutput)) +
geom_line() +
geom_point(data = aggbiopools |> filter(species %in% c("Diatom", "DinoFlag")),
aes(x = time, y = totpool), colour = "blue") +
theme_bw() +
facet_wrap(~species, scales = "free_y")
calc_biomass_pool to run_truthWith an if statement?
Only need species and index here, comps don’t apply.